National Data Management Center¶

  • Data Analytics, Modeling and Visualization

1. Problem Solving Section¶

Preprocessing and EDA¶

In this section the data are cleaned, transformed, and prepared for analysis, and their structure, relationships, and patterns are explored through visualizations and statistical summaries.

Importing the relevant libraries¶

Below are all the required packages for the analysis and modeling.

A. Loading the data¶

packet_version_id id_ver_nmb champs_id dp_001 dp_002 dp_003 dp_004 dp_005 dp_006 dp_007 ... dpf_012___ch00040 dpf_012___ch00041 dpf_012___ch00042 dpf_012___ch00043 dpf_012___ch01424 dpf_012___ch01875 dpf_012___ch00010 dpf_013 dpf_014 crf_060302_decode_panel_feedback_form_complete
0 ETAA00002_01_01 2.0.0 ETAA00002 5 1 2 3 4.0 5.0 6.0 ... 0 0 0 0 0 0 0 Tseyon Tesfaye Clinical None 2
1 ETAA00004_01_02 2.0.0 ETAA00004 5 1 2 3 4.0 5.0 6.0 ... 0 0 0 0 0 0 0 Adugna (SBS team), Tigistu (counselor), Tseyon... NaN 2

2 rows × 381 columns

B. Shape of the dataset¶

(444, 381)
champs_id dp_013 dp_108 dp_118
0 ETAA00002 CH00716 Undetermined Undetermined
1 ETAA00004 CH00716 Undetermined Undetermined
2 ETAA00005 CH00716 Intrauterine hypoxia Fetus and newborn affected by other forms of p...
3 ETAA00008 CH00719 Severe acute malnutrition - Kwashiorkor NaN
4 ETAA00009 CH01406 Sepsis NaN
(444, 4)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 444 entries, 0 to 443
Data columns (total 4 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   champs_id  444 non-null    object
 1   dp_013     444 non-null    object
 2   dp_108     444 non-null    object
 3   dp_118     197 non-null    object
dtypes: object(4)
memory usage: 14.0+ KB
champs_id dp_013 dp_108 dp_118
count 444 444 444 197
unique 444 6 97 97
top ETAA00002 CH00716 Intrauterine hypoxia Preeclampsia
freq 1 239 148 36
champs_id      0
dp_013         0
dp_108         0
dp_118       247
dtype: int64

C. Enumerate the columns of the dataset¶

Column 0: champs_id
Column 1: dp_013
Column 2: dp_108
Column 3: dp_118

D. Rename columns¶

  • Columns are renamed here, as directed, for better understanding.
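The renaming step might look like the sketch below. The mapping is inferred from the renamed output shown under "Updating values" (CHAMPS_ID, Case Type, Underlying Cause, Maternal Factor); the two sample rows and the name `column_names` are illustrative assumptions, not the notebook's actual code.

```python
import pandas as pd

# Two illustrative rows in the same layout as the four-column subset above.
df = pd.DataFrame({
    "champs_id": ["ETAA00002", "ETAA00008"],
    "dp_013": ["CH00716", "CH00719"],
    "dp_108": ["Undetermined", "Severe acute malnutrition - Kwashiorkor"],
    "dp_118": ["Undetermined", None],
})

# Assumed mapping from coded column names to readable ones.
column_names = {
    "champs_id": "CHAMPS_ID",
    "dp_013": "Case Type",
    "dp_108": "Underlying Cause",
    "dp_118": "Maternal Factor",
}

# rename() returns a new DataFrame; unlisted columns would keep their names.
df = df.rename(columns=column_names)
print(list(df.columns))
```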

Updating values¶

  • In this section some of the coded values are updated, particularly the values of Case Type.
CHAMPS_ID Case Type Underlying Cause Maternal Factor
399 ETAA01154 Stillbirth Intrauterine hypoxia NaN
334 ETAA01007 Stillbirth Intrauterine hypoxia Fetus and newborn affected by other malpresent...
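A minimal sketch of the value update, assuming the codes are replaced through a dictionary. Only the pair CH00716 and Stillbirth is supported by the outputs in this notebook (both occur 239 times); labels for the other codes would have to come from the data dictionary, so they are left unmapped here.

```python
import pandas as pd

# Illustrative Case Type codes taken from the rows shown earlier.
df = pd.DataFrame({"Case Type": ["CH00716", "CH00716", "CH00719"]})

# Assumed code-to-label mapping; extend it from the data dictionary.
case_type_labels = {"CH00716": "Stillbirth"}

# replace() leaves unmapped codes unchanged instead of turning them into NaN.
df["Case Type"] = df["Case Type"].replace(case_type_labels)
print(df["Case Type"].tolist())
```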

Null proportion in each column¶

  • Null values in each column are identified

About 56% of the values in the Maternal Factor column are null (247 of 444 rows).

CHAMPS_ID           0.000000
Case Type           0.000000
Underlying Cause    0.000000
Maternal Factor     0.556306
dtype: float64
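One way to obtain per-column null proportions like those above is to take the mean of the boolean null mask. The sketch below uses a small made-up frame; with the real data, 247 null values out of 444 rows give 0.556306 for Maternal Factor.

```python
import numpy as np
import pandas as pd

# Made-up frame: one of four Maternal Factor values is present.
df = pd.DataFrame({
    "Underlying Cause": ["Sepsis", "Undetermined", "Birth asphyxia", "Intrauterine hypoxia"],
    "Maternal Factor": ["Preeclampsia", np.nan, np.nan, np.nan],
})

# isnull() gives True/False per cell; the column mean is the share of nulls.
null_share = df.isnull().mean()
print(null_share)
```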

2. Descriptive Data Analysis¶

  • Based on the given decoded table and the data dictionary, a descriptive analysis of the dataset is performed.

A. What are the magnitude and proportion of each underlying cause of child death?¶

Identify the driving factors for child death:¶

Here the Underlying Cause column is used to find the magnitude and proportion of each underlying cause.
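A magnitude and proportion table of this kind can be built from `value_counts()`. The helper name `magnitude_proportion` and the sample series are assumptions for illustration, not the notebook's own code.

```python
import pandas as pd

def magnitude_proportion(series: pd.Series) -> pd.DataFrame:
    """Count each category and express it as a percentage of non-null values."""
    counts = series.value_counts()  # sorted descending, NaN excluded
    return pd.DataFrame({
        "Magnitude": counts,
        "Proportion (%)": counts / counts.sum() * 100,
    })

# Made-up sample: six deaths across three causes.
causes = pd.Series(
    ["Intrauterine hypoxia"] * 3 + ["Birth asphyxia"] * 2 + ["Sepsis"]
)
summary = magnitude_proportion(causes)
print(summary)
```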

                                                  Magnitude  Proportion (%)
Intrauterine hypoxia                                    148       33.333333
Birth asphyxia                                           33        7.432432
Undetermined                                             28        6.306306
Severe acute malnutrition                                24        5.405405
Craniorachischisis                                       16        3.603604
...                                                     ...             ...
Severe acute malnutrition-Kwashiorkor                     1        0.225225
severe acute malnutrition, Marasmic Kwashiorkor           1        0.225225
Severe acute malnutrition - Marasmic kwashiorkor          1        0.225225
Congenital CMV infection                                  1        0.225225
Bacterial sepsis of the newborn                           1        0.225225

[97 rows x 2 columns]
[Figure: bar graph of the underlying causes]

*****Insight from the above | Underlying Cause*****

  • The main points of the descriptive summary and the bar graph above are summarized below.
  • Intrauterine hypoxia is by far the most common underlying cause, accounting for 33% of all deaths (148 of 444).
  • Birth asphyxia is the second most common underlying cause, accounting for 7% of all deaths.
  • For 6% of the deaths the underlying cause is Undetermined, which ranks third in the given dataset.
  • Severe acute malnutrition follows, accounting for 5% of the deaths.
  • In summary, given the magnitude and proportion of the underlying causes, special attention should be paid to reducing deaths caused by Intrauterine hypoxia.

B. What are the proportion and magnitude of the maternal factors contributing to child death?¶

Identify the driving factors for child death:¶

Here the Maternal Factor column is used to find the magnitude and proportion of each maternal factor.
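Because Maternal Factor is missing for 247 of the 444 rows, the choice of denominator matters. The sketch below, on made-up data, contrasts the share among recorded values with the share among all rows. The reported 18.274112% for Preeclampsia equals 36/197, so the table appears to use the recorded values only.

```python
import numpy as np
import pandas as pd

# Made-up sample: three recorded factors and five missing values.
factors = pd.Series(
    ["Preeclampsia", "Preeclampsia", "Twin pregnancy"] + [np.nan] * 5
)

# Share among rows with a recorded factor (NaN dropped by default).
share_of_recorded = factors.value_counts(normalize=True) * 100

# Share among all rows, including those with no recorded factor.
share_of_all = factors.value_counts() / len(factors) * 100

print(share_of_recorded["Preeclampsia"], share_of_all["Preeclampsia"])
```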

                                                    Magnitude  Proportion (%)
Preeclampsia                                               36       18.274112
Twin pregnancy                                             12        6.091371
Fetus and newborn affected by other forms of pl...         11        5.583756
Eclampsia                                                   9        4.568528
Fetus and newborn affected by other forms of pl...          5        2.538071
...                                                       ...             ...
Fetus and newborn affected by oligohydramnios               1        0.507614
Fetus and newborn affected by maternal diabetes             1        0.507614
Fetus and newborn affected by maternal infectio...          1        0.507614
Fetus and newborn affected by multiple pregnanc...          1        0.507614
Pre-labor rapture of membrane                               1        0.507614

[97 rows x 2 columns]
[Figure: bar graph of the maternal factors]

*****Insight from the above | Maternal Factor*****

  • The main points of the descriptive summary and the bar graph above, on the contribution of each Maternal Factor, are summarized below. The proportions are relative to the 197 deaths with a recorded maternal factor, not to all 444 deaths.
  • Preeclampsia is by far the most common maternal factor, recorded in 18.3% of these deaths (36 cases).
  • Twin pregnancy is the second most common maternal factor, recorded in 6.1% of these deaths (12 cases).
  • A "Fetus and newborn affected by other forms of pl..." category (label truncated in the output) ranks third in the given dataset, at 5.6% (11 cases).
  • Next, Eclampsia contributes about 4.6% (9 cases).
  • In summary, given the magnitude and proportion of the maternal factors, special attention is required to reduce the deaths to which Preeclampsia contributes.

C. What is the proportion of child deaths by case type?¶

                                          Magnitude  Proportion (%)
Stillbirth                                      239       53.828829
Death in the first 24 hours                      69       15.540541
Early Neonate (1 to 6 days)                      49       11.036036
Child (12 months to less than 60 months)         42        9.459459
Infant (28 days to less than 12 months)          27        6.081081
Late Neonate (7 to 27 days)                      18        4.054054
[Figure: pie chart of deaths by case type]

*****Insight from the above | Case Type*****

  • The main points of the descriptive summary and the pie chart above, on the case types, are summarized below.
  • Stillbirth is the most common case type, accounting for 53.8% of all deaths.
  • Death in the first 24 hours is the second most common case type, accounting for 15.5% of all deaths.
  • Early Neonate (1 to 6 days) accounts for 11.0% of the deaths and ranks third in the given dataset.
  • Next, Child (12 months to less than 60 months) accounts for 9.5% of the deaths.
  • In summary, given the magnitude and proportion of the case types, dedicated research is required to mitigate the problems behind the Stillbirth case type.

3. Correlation analysis¶

Using correlation or heat maps, show how each of the infant underlying conditions and maternal factors is correlated with the top three causes of child death identified above under 2(A).

Prepare Data for Correlation Analysis:¶

  • First, create dummy variables for the categorical columns.
  • Choose an appropriate encoding technique, such as one-hot encoding.
  • Compute the correlation matrix.
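The three steps above can be sketched as follows, assuming `pd.get_dummies` for the one-hot encoding and Pearson correlation from `DataFrame.corr()`. The sample rows are made up, and selecting columns by the `Underlying Cause_` prefix is an assumption about how the subset was taken.

```python
import pandas as pd

# Made-up rows restricted to the top three underlying causes.
df = pd.DataFrame({
    "Underlying Cause": [
        "Intrauterine hypoxia", "Birth asphyxia", "Intrauterine hypoxia",
        "Undetermined", "Birth asphyxia", "Intrauterine hypoxia",
    ],
    "Maternal Factor": [
        "Preeclampsia", "Abruptio placenta", "Preeclampsia",
        None, "Abruptio placenta", "Twin pregnancy",
    ],
})

# One-hot encode both columns; a missing factor yields all-zero dummies.
dummies = pd.get_dummies(df[["Underlying Cause", "Maternal Factor"]], dtype=float)

# Pearson correlation between every pair of dummy columns.
corr = dummies.corr()

# Keep only the columns for the underlying causes.
cause_cols = [c for c in corr.columns if c.startswith("Underlying Cause_")]
corr_with_causes = corr[cause_cols]
print(corr_with_causes.round(3))
```

Dummies derived from the same column are mutually exclusive, so their pairwise correlations are negative by construction.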
                                       Underlying Cause_Birth asphyxia  \
Underlying Cause_Birth asphyxia                               1.000000   
Underlying Cause_Intrauterine hypoxia                        -0.674476   
Underlying Cause_Undetermined                                -0.170310   
Maternal Factor_Abruptio placenta                             0.160128   
Maternal Factor_Abruption placenta                            0.058061   
...                                                                ...   
Maternal Factor_Severe preeclampsia                          -0.030024   
Maternal Factor_Twin pregnancy                               -0.052255   
Maternal Factor_Undetermined                                 -0.042563   
Maternal Factor_Uterine rupture                               0.092219   
Maternal Factor_preeclampsia                                 -0.030024   

                                       Underlying Cause_Intrauterine hypoxia  \
Underlying Cause_Birth asphyxia                                    -0.674476   
Underlying Cause_Intrauterine hypoxia                               1.000000   
Underlying Cause_Undetermined                                      -0.612640   
Maternal Factor_Abruptio placenta                                  -0.108003   
Maternal Factor_Abruption placenta                                 -0.011007   
...                                                                      ...   
Maternal Factor_Severe preeclampsia                                 0.044515   
Maternal Factor_Twin pregnancy                                      0.077475   
Maternal Factor_Undetermined                                       -0.153107   
Maternal Factor_Uterine rupture                                    -0.045001   
Maternal Factor_preeclampsia                                        0.044515   

                                       Underlying Cause_Undetermined  \
Underlying Cause_Birth asphyxia                            -0.170310   
Underlying Cause_Intrauterine hypoxia                      -0.612640   
Underlying Cause_Undetermined                               1.000000   
Maternal Factor_Abruptio placenta                          -0.027271   
Maternal Factor_Abruption placenta                         -0.047464   
...                                                              ...   
Maternal Factor_Severe preeclampsia                        -0.027271   
Maternal Factor_Twin pregnancy                             -0.047464   
Maternal Factor_Undetermined                                0.249914   
Maternal Factor_Uterine rupture                            -0.038661   
Maternal Factor_preeclampsia                               -0.027271   

                                       Maternal Factor_Abruptio placenta  \
Underlying Cause_Birth asphyxia                                 0.160128   
Underlying Cause_Intrauterine hypoxia                          -0.108003   
Underlying Cause_Undetermined                                  -0.027271   
Maternal Factor_Abruptio placenta                               1.000000   
Maternal Factor_Abruption placenta                             -0.008367   
...                                                                  ...   
Maternal Factor_Severe preeclampsia                            -0.004808   
Maternal Factor_Twin pregnancy                                 -0.008367   
Maternal Factor_Undetermined                                   -0.006816   
Maternal Factor_Uterine rupture                                -0.006816   
Maternal Factor_preeclampsia                                   -0.004808   

                                       Maternal Factor_Abruption placenta  \
Underlying Cause_Birth asphyxia                                  0.058061   
Underlying Cause_Intrauterine hypoxia                           -0.011007   
Underlying Cause_Undetermined                                   -0.047464   
Maternal Factor_Abruptio placenta                               -0.008367   
Maternal Factor_Abruption placenta                               1.000000   
...                                                                   ...   
Maternal Factor_Severe preeclampsia                             -0.008367   
Maternal Factor_Twin pregnancy                                  -0.014563   
Maternal Factor_Undetermined                                    -0.011862   
Maternal Factor_Uterine rupture                                 -0.011862   
Maternal Factor_preeclampsia                                    -0.008367   

                                       Maternal Factor_Antepartum hemorrhage  \
Underlying Cause_Birth asphyxia                                     0.092219   
Underlying Cause_Intrauterine hypoxia                              -0.045001   
Underlying Cause_Undetermined                                      -0.038661   
Maternal Factor_Abruptio placenta                                  -0.006816   
Maternal Factor_Abruption placenta                                 -0.011862   
...                                                                      ...   
Maternal Factor_Severe preeclampsia                                -0.006816   
Maternal Factor_Twin pregnancy                                     -0.011862   
Maternal Factor_Undetermined                                       -0.009662   
Maternal Factor_Uterine rupture                                    -0.009662   
Maternal Factor_preeclampsia                                       -0.006816   

                                       Maternal Factor_Chorioamnionitis  \
Underlying Cause_Birth asphyxia                                0.092219   
Underlying Cause_Intrauterine hypoxia                         -0.045001   
Underlying Cause_Undetermined                                 -0.038661   
Maternal Factor_Abruptio placenta                             -0.006816   
Maternal Factor_Abruption placenta                            -0.011862   
...                                                                 ...   
Maternal Factor_Severe preeclampsia                           -0.006816   
Maternal Factor_Twin pregnancy                                -0.011862   
Maternal Factor_Undetermined                                  -0.009662   
Maternal Factor_Uterine rupture                               -0.009662   
Maternal Factor_preeclampsia                                  -0.006816   

                                       Maternal Factor_Cord prolapse  \
Underlying Cause_Birth asphyxia                            -0.030024   
Underlying Cause_Intrauterine hypoxia                       0.044515   
Underlying Cause_Undetermined                              -0.027271   
Maternal Factor_Abruptio placenta                          -0.004808   
Maternal Factor_Abruption placenta                         -0.008367   
...                                                              ...   
Maternal Factor_Severe preeclampsia                        -0.004808   
Maternal Factor_Twin pregnancy                             -0.008367   
Maternal Factor_Undetermined                               -0.006816   
Maternal Factor_Uterine rupture                            -0.006816   
Maternal Factor_preeclampsia                               -0.004808   

                                       Maternal Factor_Eclampsia  \
Underlying Cause_Birth asphyxia                        -0.007677   
Underlying Cause_Intrauterine hypoxia                   0.061015   
Underlying Cause_Undetermined                          -0.073217   
Maternal Factor_Abruptio placenta                      -0.012907   
Maternal Factor_Abruption placenta                     -0.022465   
...                                                          ...   
Maternal Factor_Severe preeclampsia                    -0.012907   
Maternal Factor_Twin pregnancy                         -0.022465   
Maternal Factor_Undetermined                           -0.018298   
Maternal Factor_Uterine rupture                        -0.018298   
Maternal Factor_preeclampsia                           -0.012907   

                                       Maternal Factor_Eclampsia /HELLP Syndrome   \
Underlying Cause_Birth asphyxia                                         -0.030024   
Underlying Cause_Intrauterine hypoxia                                    0.044515   
Underlying Cause_Undetermined                                           -0.027271   
Maternal Factor_Abruptio placenta                                       -0.004808   
Maternal Factor_Abruption placenta                                      -0.008367   
...                                                                           ...   
Maternal Factor_Severe preeclampsia                                     -0.004808   
Maternal Factor_Twin pregnancy                                          -0.008367   
Maternal Factor_Undetermined                                            -0.006816   
Maternal Factor_Uterine rupture                                         -0.006816   
Maternal Factor_preeclampsia                                            -0.004808   

                                       ...  \
Underlying Cause_Birth asphyxia        ...   
Underlying Cause_Intrauterine hypoxia  ...   
Underlying Cause_Undetermined          ...   
Maternal Factor_Abruptio placenta      ...   
Maternal Factor_Abruption placenta     ...   
...                                    ...   
Maternal Factor_Severe preeclampsia    ...   
Maternal Factor_Twin pregnancy         ...   
Maternal Factor_Undetermined           ...   
Maternal Factor_Uterine rupture        ...   
Maternal Factor_preeclampsia           ...   

                                       Maternal Factor_Pre-labour preterm rupture of membranes  \
Underlying Cause_Birth asphyxia                                                -0.030024         
Underlying Cause_Intrauterine hypoxia                                           0.044515         
Underlying Cause_Undetermined                                                  -0.027271         
Maternal Factor_Abruptio placenta                                              -0.004808         
Maternal Factor_Abruption placenta                                             -0.008367         
...                                                                                  ...         
Maternal Factor_Severe preeclampsia                                            -0.004808         
Maternal Factor_Twin pregnancy                                                 -0.008367         
Maternal Factor_Undetermined                                                   -0.006816         
Maternal Factor_Uterine rupture                                                -0.006816         
Maternal Factor_preeclampsia                                                   -0.004808         

                                       Maternal Factor_Precipitated labour  \
Underlying Cause_Birth asphyxia                                  -0.030024   
Underlying Cause_Intrauterine hypoxia                             0.044515   
Underlying Cause_Undetermined                                    -0.027271   
Maternal Factor_Abruptio placenta                                -0.004808   
Maternal Factor_Abruption placenta                               -0.008367   
...                                                                    ...   
Maternal Factor_Severe preeclampsia                              -0.004808   
Maternal Factor_Twin pregnancy                                   -0.008367   
Maternal Factor_Undetermined                                     -0.006816   
Maternal Factor_Uterine rupture                                  -0.006816   
Maternal Factor_preeclampsia                                     -0.004808   

                                       Maternal Factor_Preeclampsia  \
Underlying Cause_Birth asphyxia                           -0.032492   
Underlying Cause_Intrauterine hypoxia                      0.132202   
Underlying Cause_Undetermined                             -0.141664   
Maternal Factor_Abruptio placenta                         -0.024974   
Maternal Factor_Abruption placenta                        -0.043466   
...                                                             ...   
Maternal Factor_Severe preeclampsia                       -0.024974   
Maternal Factor_Twin pregnancy                            -0.043466   
Maternal Factor_Undetermined                              -0.035404   
Maternal Factor_Uterine rupture                           -0.035404   
Maternal Factor_preeclampsia                              -0.024974   

                                       Maternal Factor_Premature rupture of membranes, onset of labour after 24 hours  \
Underlying Cause_Birth asphyxia                                                -0.030024                                
Underlying Cause_Intrauterine hypoxia                                           0.044515                                
Underlying Cause_Undetermined                                                  -0.027271                                
Maternal Factor_Abruptio placenta                                              -0.004808                                
Maternal Factor_Abruption placenta                                             -0.008367                                
...                                                                                  ...                                
Maternal Factor_Severe preeclampsia                                            -0.004808                                
Maternal Factor_Twin pregnancy                                                 -0.008367                                
Maternal Factor_Undetermined                                                   -0.006816                                
Maternal Factor_Uterine rupture                                                -0.006816                                
Maternal Factor_preeclampsia                                                   -0.004808                                

                                       Maternal Factor_Prolonged pregnancy  \
Underlying Cause_Birth asphyxia                                   0.160128   
Underlying Cause_Intrauterine hypoxia                            -0.108003   
Underlying Cause_Undetermined                                    -0.027271   
Maternal Factor_Abruptio placenta                                -0.004808   
Maternal Factor_Abruption placenta                               -0.008367   
...                                                                    ...   
Maternal Factor_Severe preeclampsia                              -0.004808   
Maternal Factor_Twin pregnancy                                   -0.008367   
Maternal Factor_Undetermined                                     -0.006816   
Maternal Factor_Uterine rupture                                  -0.006816   
Maternal Factor_preeclampsia                                     -0.004808   

                                       Maternal Factor_Severe preeclampsia  \
Underlying Cause_Birth asphyxia                                  -0.030024   
Underlying Cause_Intrauterine hypoxia                             0.044515   
Underlying Cause_Undetermined                                    -0.027271   
Maternal Factor_Abruptio placenta                                -0.004808   
Maternal Factor_Abruption placenta                               -0.008367   
...                                                                    ...   
Maternal Factor_Severe preeclampsia                               1.000000   
Maternal Factor_Twin pregnancy                                   -0.008367   
Maternal Factor_Undetermined                                     -0.006816   
Maternal Factor_Uterine rupture                                  -0.006816   
Maternal Factor_preeclampsia                                     -0.004808   

                                       Maternal Factor_Twin pregnancy  \
Underlying Cause_Birth asphyxia                             -0.052255   
Underlying Cause_Intrauterine hypoxia                        0.077475   
Underlying Cause_Undetermined                               -0.047464   
Maternal Factor_Abruptio placenta                           -0.008367   
Maternal Factor_Abruption placenta                          -0.014563   
...                                                               ...   
Maternal Factor_Severe preeclampsia                         -0.008367   
Maternal Factor_Twin pregnancy                               1.000000   
Maternal Factor_Undetermined                                -0.011862   
Maternal Factor_Uterine rupture                             -0.011862   
Maternal Factor_preeclampsia                                -0.008367   

                                       Maternal Factor_Undetermined  \
Underlying Cause_Birth asphyxia                           -0.042563   
Underlying Cause_Intrauterine hypoxia                     -0.153107   
Underlying Cause_Undetermined                              0.249914   
Maternal Factor_Abruptio placenta                         -0.006816   
Maternal Factor_Abruption placenta                        -0.011862   
...                                                             ...   
Maternal Factor_Severe preeclampsia                       -0.006816   
Maternal Factor_Twin pregnancy                            -0.011862   
Maternal Factor_Undetermined                               1.000000   
Maternal Factor_Uterine rupture                           -0.009662   
Maternal Factor_preeclampsia                              -0.006816   

                                       Maternal Factor_Uterine rupture  \
Underlying Cause_Birth asphyxia                               0.092219   
Underlying Cause_Intrauterine hypoxia                        -0.045001   
Underlying Cause_Undetermined                                -0.038661   
Maternal Factor_Abruptio placenta                            -0.006816   
Maternal Factor_Abruption placenta                           -0.011862   
...                                                                ...   
Maternal Factor_Severe preeclampsia                          -0.006816   
Maternal Factor_Twin pregnancy                               -0.011862   
Maternal Factor_Undetermined                                 -0.009662   
Maternal Factor_Uterine rupture                               1.000000   
Maternal Factor_preeclampsia                                 -0.006816   

                                       Maternal Factor_preeclampsia  
Underlying Cause_Birth asphyxia                           -0.030024  
Underlying Cause_Intrauterine hypoxia                      0.044515  
Underlying Cause_Undetermined                             -0.027271  
Maternal Factor_Abruptio placenta                         -0.004808  
Maternal Factor_Abruption placenta                        -0.008367  
...                                                             ...  
Maternal Factor_Severe preeclampsia                       -0.004808  
Maternal Factor_Twin pregnancy                            -0.008367  
Maternal Factor_Undetermined                              -0.006816  
Maternal Factor_Uterine rupture                           -0.006816  
Maternal Factor_preeclampsia                               1.000000  

[63 rows x 63 columns]
                                       Underlying Cause_Birth asphyxia  Underlying Cause_Intrauterine hypoxia  Underlying Cause_Undetermined  Maternal Factor_Undetermined
Underlying Cause_Birth asphyxia                               1.000000                              -0.674476                      -0.170310                     -0.042563
Underlying Cause_Intrauterine hypoxia                        -0.674476                               1.000000                      -0.612640                     -0.153107
Underlying Cause_Undetermined                                -0.170310                              -0.612640                       1.000000                      0.249914
Maternal Factor_Abruptio placenta                             0.160128                              -0.108003                      -0.027271                     -0.006816
Maternal Factor_Abruption placenta                            0.058061                              -0.011007                      -0.047464                     -0.011862
...                                                                ...                                    ...                            ...                           ...
Maternal Factor_Severe preeclampsia                          -0.030024                               0.044515                      -0.027271                     -0.006816
Maternal Factor_Twin pregnancy                               -0.052255                               0.077475                      -0.047464                     -0.011862
Maternal Factor_Undetermined                                 -0.042563                              -0.153107                       0.249914                      1.000000
Maternal Factor_Uterine rupture                               0.092219                              -0.045001                      -0.038661                     -0.009662
Maternal Factor_preeclampsia                                 -0.030024                               0.044515                      -0.027271                     -0.006816

[63 rows x 4 columns]

*****Insight from the above | correlation Analysis*****

  • Correlation analysis was performed to examine how infant underlying conditions and maternal factors correlate with the top three causes of child death.
  • In most cases the correlation between the variables is negative. A few pairs show a positive relationship, and some show no relationship at all. Here are a few illustrations:
    • Maternal Factor_Abruptio placenta vs. Underlying Cause_Birth asphyxia: 0.160128
    • Maternal Factor_Antepartum hemorrhage vs. Underlying Cause_Birth asphyxia: 0.092219
    • Maternal Factor_Prolonged pregnancy vs. Underlying Cause_Birth asphyxia: 0.160128
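
A correlation table of this shape can be sketched as follows. This is a minimal illustration, assuming the indicator columns come from one-hot encoding two categorical columns named `Underlying Cause` and `Maternal Factor`; the rows are hypothetical toy data, not the CHAMPS records.

```python
import pandas as pd

# Hypothetical toy rows (assumption); the notebook works on the real dataset.
df = pd.DataFrame({
    "Underlying Cause": ["Intrauterine hypoxia", "Birth asphyxia", "Undetermined",
                         "Intrauterine hypoxia", "Birth asphyxia", "Intrauterine hypoxia"],
    "Maternal Factor": ["Preeclampsia", "Abruptio placenta", "Undetermined",
                        "Preeclampsia", "Twin pregnancy", "Eclampsia"],
})

# One-hot encode both columns so every category becomes a 0/1 indicator.
dummies = pd.get_dummies(df, columns=["Underlying Cause", "Maternal Factor"], dtype=float)

# Pearson correlation between every pair of indicator columns.
corr = dummies.corr()

# Keep only the columns for the underlying causes, as in the table above.
cause_cols = [c for c in corr.columns if c.startswith("Underlying Cause_")]
corr_with_causes = corr[cause_cols]
print(corr_with_causes.round(3))
```

Because the indicators for one categorical column are mutually exclusive, they are negatively correlated with each other by construction, which is one reason most entries in such a table are negative.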

4. Feature engineering¶

You are expected to select the top infant underlying causes and maternal factors (features) that contribute to the top three causes of child death identified under 2(A) above. For this, you need to select the best and most likely features. In doing so:

  • A. Select the classification models LogisticRegression, Support Vector Machine, AdaBoostClassifier, Random Forest Classifier, Gradient Boosting Classifier and XGBoost, and train each on the dataset
  • B. Import the appropriate package for each of the classification models above
  • C. Rank the features based on their importance for each of the top underlying causes of child death identified above under 2(A), for each of the classification algorithms under (A)

Step 1¶

  • Encoding the data and splitting it into target and features
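
One way this step could look is sketched below. It assumes the maternal factor is one-hot encoded as the feature matrix and the underlying cause is label-encoded as the target; the rows, the 25% test share, and the random seed are illustrative assumptions, not values taken from the notebook.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data (assumption): 12 cases, 4 per underlying cause.
df = pd.DataFrame({
    "Underlying Cause": ["Intrauterine hypoxia"] * 4 + ["Birth asphyxia"] * 4 + ["Undetermined"] * 4,
    "Maternal Factor": ["Preeclampsia", "Preeclampsia", "Eclampsia", "Twin pregnancy",
                        "Abruptio placenta", "Abruptio placenta", "Prolonged pregnancy", "Preeclampsia",
                        "Undetermined", "Undetermined", "Undetermined", "Twin pregnancy"],
})

# Features: one 0/1 column per maternal factor category.
X = pd.get_dummies(df[["Maternal Factor"]], dtype=float)

# Target: integer code per underlying cause (classes are sorted alphabetically).
encoder = LabelEncoder()
y = encoder.fit_transform(df["Underlying Cause"])

# Stratify so each cause keeps its share in both the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape, list(encoder.classes_))
```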

Step 2: Train Classification Models¶

  • Identifying classifiers
  • Train each classifier and collect feature importances
[ 0.29264313  0.20315778 -0.09172598  0.20315778 -0.09172598  0.19656664
 -0.09172598 -0.09172598  0.         -0.12947367  0.          0.
 -0.12947367  0.         -0.09172598 -0.09172598 -0.09172598 -0.09172598
 -0.09172598 -0.12947367 -0.09172598 -0.09172598  0.         -0.09172598
 -0.09172598 -0.09172598  0.29264313  0.         -0.09172598  0.
  0.20315778 -0.33920149 -0.12947367 -0.09172598  0.29264313 -0.09172598
 -0.09172598 -0.09172598 -0.12947367 -0.09172598 -0.09172598  0.
 -0.12947367  0.          0.          0.29264313 -0.09172598 -0.09172598
  0.         -0.09430552 -0.09172598 -0.09172598  0.28805891 -0.09172598
  0.29264313 -0.09172598 -0.12947367 -0.13310848  0.20315778  0.        ]
[ 1.64953350e-01  8.88178420e-16 -2.22044605e-16 -8.88178420e-16
 -2.22044605e-16 -8.88178420e-16 -2.22044605e-16 -2.22044605e-16
  0.00000000e+00  4.44089210e-16  0.00000000e+00  0.00000000e+00
  4.44089210e-16  0.00000000e+00 -2.22044605e-16  2.22044605e-16
 -2.22044605e-16 -2.22044605e-16 -2.22044605e-16  4.44089210e-16
 -2.22044605e-16 -2.22044605e-16  0.00000000e+00 -2.22044605e-16
 -2.22044605e-16 -2.22044605e-16  1.64953350e-01  0.00000000e+00
 -2.22044605e-16  0.00000000e+00  1.77635684e-15  8.88178420e-16
  4.44089210e-16 -2.22044605e-16  1.64953350e-01 -2.22044605e-16
 -2.22044605e-16 -2.22044605e-16  4.44089210e-16 -2.22044605e-16
 -2.22044605e-16  0.00000000e+00  4.44089210e-16  0.00000000e+00
  0.00000000e+00  1.64953350e-01 -2.22044605e-16 -2.22044605e-16
  0.00000000e+00 -2.22044605e-16 -2.22044605e-16 -2.22044605e-16
  4.44089210e-16 -2.22044605e-16  1.64953350e-01 -2.22044605e-16
  4.44089210e-16  4.44089210e-16  8.88178420e-16  0.00000000e+00]
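
The two arrays printed above contain negative values, so they appear to be signed coefficient vectors from the linear models rather than tree-based importances. A minimal sketch of collecting one importance vector per classifier is given below. It uses synthetic data from `make_classification` as a stand-in for the encoded dataset, omits XGBoost so that it depends only on scikit-learn, and takes the first row of `coef_` for the linear models, which is an assumption about how the 1-D arrays were obtained.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Synthetic 3-class data (assumption) standing in for the encoded features.
X, y = make_classification(n_samples=150, n_features=6, n_informative=3,
                           n_redundant=0, n_classes=3, n_clusters_per_class=1,
                           random_state=0)

classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(kernel="linear"),
    "AdaBoost": AdaBoostClassifier(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

importances = {}
for name, clf in classifiers.items():
    clf.fit(X, y)
    if hasattr(clf, "feature_importances_"):
        # Tree ensembles: non-negative importances that sum to 1.
        importances[name] = clf.feature_importances_
    else:
        # Linear models: signed coefficients, one row per class (or class pair
        # for SVC); only the first row is kept here.
        importances[name] = clf.coef_[0]

for name, values in importances.items():
    print(name, np.round(values, 3))
```

Coefficients and tree importances are on different scales, so they can be ranked within a model but should not be compared across models.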

Step 3: Rank Features Based on Importance¶

  • Ranking features

****Insight from the above | Feature engineering****

  • Feature importance was computed and ranked for each classifier. The top three features per classifier, in decreasing order of importance, are:
    • Logistic Regression
      • Maternal Factor_Fetus and newborn affected
      • Maternal Factor_Prolonged pregnancy
      • Maternal Factor_Abruptio placenta
    • Support Vector Machine
      • Maternal Factor_Abruptio placenta
      • Maternal Factor_Fetus and newborn affected
      • Maternal Factor_Prolonged pregnancy
    • AdaBoost
      • Maternal Factor_Eclampsia
      • Maternal Factor_Preeclampsia
      • Maternal Factor_Abruptio placenta
    • Random Forest
      • Maternal Factor_Undetermined
      • Maternal Factor_Preeclampsia
      • Maternal Factor_Fetus and newborn affected
    • Gradient Boosting
      • Maternal Factor_Undetermined
      • Maternal Factor_Preeclampsia
      • Maternal Factor_Prolonged pregnancy
    • XGBoost
      • Maternal Factor_Preeclampsia
      • Maternal Factor_Fetus and newborn affected
      • Maternal Factor_Eclampsia
  • To summarize, some features have zero or negative importance values, as shown in the graphs above.
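
The ranking step can be sketched as below. The feature names are taken from the lists above, but the scores are hypothetical numbers chosen for illustration. The sketch also shows why negative values matter: ranking signed coefficients pushes large negative ones to the bottom, while ranking by absolute value treats them as strong effects.

```python
import pandas as pd

# Hypothetical signed coefficients (assumption), for illustration only.
scores = pd.Series({
    "Maternal Factor_Preeclampsia": 0.29,
    "Maternal Factor_Eclampsia": 0.20,
    "Maternal Factor_Twin pregnancy": -0.09,
    "Maternal Factor_Undetermined": -0.34,
})

def top_features(values: pd.Series, k: int = 3, use_abs: bool = False) -> list:
    """Return the names of the k highest-ranked features."""
    key = values.abs() if use_abs else values
    return key.sort_values(ascending=False).head(k).index.tolist()

print(top_features(scores))                # ranked by signed value
print(top_features(scores, use_abs=True))  # ranked by magnitude
```

For tree-based models the importances are non-negative, so both rankings coincide; the distinction only matters for the linear models.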

5. Model evaluation using the proper metrics¶

  • A. Import the appropriate evaluation metric packages
  • B. Using the appropriate n-fold cross validation and out-of-sample data, select the best performing model from the candidate models under 4(A)
  • C. Ensemble the models and see the performance of the combination models on the data
  • D. Use Accuracy score metrics to evaluate the performance of the models above
  • E. Plot the AUC and ROC curve on the same graph to visualize and compare the performance of each of the models above
Logistic Regression: 0.6575 ± 0.0307
Support Vector Machine: 0.6575 ± 0.0307
AdaBoost: 0.6575 ± 0.0307
Random Forest: 0.6575 ± 0.0307
Gradient Boosting: 0.6575 ± 0.0307
XGBoost: 0.6644 ± 0.0098
Best Model: XGBoost
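
Scores in the "mean ± standard deviation" form above can be produced with a loop like the following. This is a sketch under assumptions: synthetic data replaces the encoded dataset, only three scikit-learn models are included, and the 5-fold stratified split and seed are illustrative choices rather than the notebook's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic 3-class data (assumption).
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           n_classes=3, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Stratified folds keep the class proportions similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

results = {}
for name, model in models.items():
    fold_scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[name] = (fold_scores.mean(), fold_scores.std())
    print(f"{name}: {fold_scores.mean():.4f} ± {fold_scores.std():.4f}")

# Pick the model with the highest mean accuracy across folds.
best_model = max(results, key=lambda n: results[n][0])
print("Best Model:", best_model)
```

When several models report exactly the same mean and standard deviation, as five of them do above, it is worth checking whether they all make identical predictions, for example always the majority class.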
VotingClassifier(estimators=[('lr', LogisticRegression(max_iter=1000)),
                             ('svm', SVC(kernel='linear', probability=True)),
                             ('adb', AdaBoostClassifier()),
                             ('rf', RandomForestClassifier()),
                             ('gb', GradientBoostingClassifier()),
                             ('xgb',
                              XGBClassifier(base_score=0.5, booster='gbtree',
                                            colsample_bylevel=1,
                                            colsample_bynode=1,
                                            colsample_bytree=1,
                                            enable_categorical=False,
                                            e...
                                            learning_rate=0.300000012,
                                            max_delta_step=0, max_depth=6,
                                            min_child_weight=1, missing=nan,
                                            monotone_constraints='()',
                                            n_estimators=100, n_jobs=8,
                                            num_parallel_tree=1,
                                            objective='multi:softprob',
                                            predictor='auto', random_state=0,
                                            reg_alpha=0, reg_lambda=1,
                                            scale_pos_weight=None, subsample=1,
                                            tree_method='exact',
                                            use_label_encoder=False,
                                            validate_parameters=1, ...))],
                 voting='soft')
Ensemble Model Accuracy: 0.8095
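  • The ensemble model reaches 81% accuracy on the held-out test set. This figure should not be compared directly with the scores above (highest: XGBoost at 66%), which were computed by cross-validation rather than on the test set. On the same test set the individual classifiers score the same or slightly lower (see the test accuracies below), so these results do not show a clear gain from ensembling.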
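
A soft-voting ensemble like the one printed above can be assembled as follows. The estimator names and `voting='soft'` follow the printed `VotingClassifier`; the synthetic data, the 80/20 split, and the omission of XGBoost (to keep the sketch scikit-learn only) are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 3-class data (assumption).
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Soft voting averages predicted class probabilities, so every member must
# implement predict_proba (hence probability=True for the SVC).
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC(kernel="linear", probability=True)),
        ("adb", AdaBoostClassifier()),
        ("rf", RandomForestClassifier(random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

ensemble_accuracy = accuracy_score(y_test, ensemble.predict(X_test))
print(f"Ensemble Model Accuracy: {ensemble_accuracy:.4f}")
```

To judge whether the ensemble helps, its accuracy should be compared with each member's accuracy on the same `X_test`, not with cross-validation scores.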

Accuracy Score Metrics to Evaluate the Performance of the Models¶

Logistic Regression Test Accuracy: 0.8095
Support Vector Machine Test Accuracy: 0.8095
AdaBoost Test Accuracy: 0.7778
Random Forest Test Accuracy: 0.8095
Gradient Boosting Test Accuracy: 0.8095
XGBoost Test Accuracy: 0.8095
Ensemble Model Test Accuracy: 0.8095
  • As the results above show, all the classifiers reach 81% test accuracy except AdaBoost (78%).

Plot the AUC and ROC Curve to Visualize and Compare Performance¶

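
Because the target has three classes, a single ROC curve per model needs an averaging scheme. The sketch below uses micro-averaging, which pools every (sample, class) pair into one binary problem; this is one possible choice and not necessarily the one used for the original figure. The data and the two models are illustrative assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic 3-class data (assumption).
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
}

# One indicator column per class, in the same order as predict_proba columns.
y_test_bin = label_binarize(y_test, classes=[0, 1, 2])

fig, ax = plt.subplots()
auc_scores = {}
for name, model in models.items():
    proba = model.fit(X_train, y_train).predict_proba(X_test)
    # Micro-average: flatten labels and scores into one binary problem.
    fpr, tpr, _ = roc_curve(y_test_bin.ravel(), proba.ravel())
    auc_scores[name] = auc(fpr, tpr)
    ax.plot(fpr, tpr, label=f"{name} (AUC = {auc_scores[name]:.2f})")

# Diagonal reference line for a classifier that guesses at random.
ax.plot([0, 1], [0, 1], linestyle="--", color="grey", label="Chance")
ax.set_xlabel("False Positive Rate")
ax.set_ylabel("True Positive Rate")
ax.set_title("ROC curves (micro-average)")
ax.legend(loc="lower right")
```

In a notebook the figure renders automatically; in a script, call `plt.show()` afterwards.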

6. Result Visualization¶

A. Plot the feature importance in descending order for each of the models using horizontal bar chart¶

  • Insights about the above visualizations are described under section 4 (Feature engineering).

B. Plot the top five infant underlying causes of the child death¶


****Insight from the above****

  • Intrauterine hypoxia is by far the most common underlying cause of infant death.
  • Birth asphyxia is the second most common underlying cause of infant death.
  • Undetermined is the third most common underlying cause in the given dataset.
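
A top-five ranking of this kind can be sketched with `value_counts` and a horizontal bar chart. The cause names come from this document, except `Cause F`, which is a hypothetical placeholder; all counts are invented for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical counts per cause (assumption), expanded into one row per case.
toy_counts = {
    "Intrauterine hypoxia": 6,
    "Birth asphyxia": 4,
    "Undetermined": 3,
    "Sepsis": 2,
    "Severe acute malnutrition - Kwashiorkor": 2,
    "Cause F": 1,
}
causes = pd.Series(
    [cause for cause, n in toy_counts.items() for _ in range(n)],
    name="Underlying Cause",
)

# value_counts sorts by frequency in descending order; keep the first five.
top_five = causes.value_counts().head(5)

# Reverse the order so the most frequent cause is drawn at the top.
fig, ax = plt.subplots()
ax.barh(top_five.index[::-1], top_five.values[::-1])
ax.set_xlabel("Number of deaths")
ax.set_title("Top five underlying causes")
print(top_five)
```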

C. Plot the top five maternal factors contributing to the child death¶


*****Insight from the above | Maternal Factor*****

  • The descriptive summary and bar graph above show the contribution of maternal factors to infant death. A short summary follows.
  • Preeclampsia is by far the most common maternal factor for infant death, accounting for 18% of the total deaths.
  • Twin pregnancy is the second most common maternal factor for infant death, accounting for 6% of the total deaths.
  • Fetus and newborn affected by other forms accounts for 5% of the infant deaths and ranks third in the given dataset.
  • Next is Eclampsia, which contributes around 4% of the infant deaths.
  • In summary, given the magnitude and proportion of its contribution, Preeclampsia requires special attention in order to reduce the infant deaths associated with it.
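
When reading these percentages, the denominator matters. The dataset summary earlier in this document reports 197 non-null `dp_118` values out of 444 rows, with Preeclampsia appearing 36 times: 36/197 is about 18%, while 36/444 is about 8%. If `dp_118` is the maternal factor column, the 18% figure may therefore be a share of cases with a recorded factor rather than of all deaths. The sketch below shows both calculations on hypothetical toy data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column (assumption): 7 recorded factors, 13 missing.
factor = pd.Series(
    ["Preeclampsia"] * 4 + ["Twin pregnancy"] * 2 + ["Eclampsia"] + [np.nan] * 13,
    name="dp_118",
)

# normalize=True divides by the number of non-missing values (NaN is dropped).
share_of_recorded = factor.value_counts(normalize=True)

# Dividing raw counts by the total row count gives the share of all cases.
share_of_all = factor.value_counts() / len(factor)

print((share_of_recorded * 100).round(1))
print((share_of_all * 100).round(1))
```

Stating which denominator is used next to each percentage avoids ambiguity in the summary.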

D. Plot the child death based on the case types¶


*****Insight from the above | Case Type*****

  • The descriptive summary and pie chart above show how the deaths are distributed across case types. A short summary follows.
  • Stillbirth is the most common case type, accounting for 53% of the total death cases.
  • Death in the first 24 hours is the second most common case type, accounting for 15% of the total death cases.
  • Early Neonate (1 to 6 days) accounts for 11% of the deaths and ranks third in the given dataset.
  • Next, Child (12 months to less than 60 months) accounts for 9% of the deaths.
  • In summary, given its magnitude and proportion, the Stillbirth case type requires dedicated research and study to mitigate the problems behind it.
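
A pie chart of these shares can be sketched as below. The four percentages are the ones quoted in the list above; the `Other case types` slice is simply the remainder to 100% and is an assumption, since the individual remaining case types are not listed here.

```python
import matplotlib.pyplot as plt

# Shares quoted in the summary above, in percent.
shares = {
    "Stillbirth": 53,
    "Death in the first 24 hours": 15,
    "Early Neonate (1 to 6 days)": 11,
    "Child (12 months to less than 60 months)": 9,
}
# Remainder grouped into one slice (assumption).
shares["Other case types"] = 100 - sum(shares.values())

fig, ax = plt.subplots()
ax.pie(
    list(shares.values()),
    labels=list(shares.keys()),
    autopct="%1.0f%%",  # print each slice's percentage on the chart
    startangle=90,
)
ax.set_title("Child deaths by case type")
```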